Abstract: Drug synergy prediction (DSP) aims to identify efficacious drug combinations under various cellular contexts with different targets. However, the continual emergence of novel compounds introduces variations in molecular scaffolds and sizes, causing drug synergy data to exhibit out-of-distribution (O.O.D.) shifts with respect to topological structure. Existing works rely on the in-distribution (I.D.) assumption and therefore fail to handle these O.O.D. shifts. To solve this problem, we study out-of-distribution generalized drug synergy prediction through a graph large language model for the first time. O.O.D. generalized DSP is highly non-trivial and poses several challenges: i) how to discover structurally relevant and irrelevant molecular representations with respect to cell targets; ii) how to find the optimal graph neural architectures that accurately calculate molecular representations; and iii) how to jointly leverage molecular structural and semantic information in LLMs. To address these challenges, we propose OOD-GraphLLM, a novel graph LLM framework that accurately predicts drug synergy under O.O.D. settings by jointly optimizing molecular graph representations and biomedical semantic language representations in a unified manner. Furthermore, we fine-tune DrugSyn-LLM, a biomedical LLM, and employ a retrieval-augmented biomedical instruction tuning strategy to align molecular topological information and molecular semantic information with language-based reasoning for O.O.D. generalized DSP. Both the source code (https://github.com/EkkoXiao/Bio-GraphLLM) and the released model (https://mn.cs.tsinghua.edu.cn/bio-graphllm/) are publicly available; users can download model resources and interact with the system through a web interface.
Abstract: Multi-modal graphs, which integrate diverse multi-modal features and relations, are ubiquitous in real-world applications. However, existing multi-modal graph learning methods are typically trained from scratch for specific graph data and tasks, and thus fail to generalize across different multi-modal graph data and tasks. To bridge this gap, we explore the potential of Multi-modal Graph Large Language Models (MG-LLM) to unify and generalize across diverse multi-modal graph data and tasks. We propose a unified framework of multi-modal graph data, task, and model, discovering the inherent multi-granularity and multi-scale characteristics in multi-modal graphs. Specifically, we present five key desired characteristics for MG-LLM: 1) a unified space for multi-modal structures and attributes, 2) the capability of handling diverse multi-modal graph tasks, 3) multi-modal graph in-context learning, 4) multi-modal graph interaction with natural language, and 5) multi-modal graph reasoning. We then elaborate on the key challenges, review related works, and highlight promising future research directions towards realizing these ambitious characteristics. Finally, we summarize existing multi-modal graph datasets pertinent to model training. We believe this paper can contribute to the ongoing advancement of research towards MG-LLM for generalization across multi-modal graph data and tasks.